The ethnic density effect is the beneficial mental and physical effect observed (Becares et al, 2010) among individuals of ethnic minority living in areas of higher ethnic minority density (own ethnictiy or overall ethnic minority).
Little research has been conducted on how the ethnic density effect impacts on suicidality (Shaw et al, 2012).
Ethnic density is defined as the composition or percentage of each ethnictiy in a given area, usually within the lower super output area (area-level address (Lower Super Output Area; area of 1500 residents)). This address is recorded closest to their diagnosis date within the observation period.
Each individual in the cohort has been assigned an ethnic density (a percentage); given their own ethnicity, the area they were living at time of diagnosis, and the ethnic density in that area.
In a given LSOA (of a total of 1500 people) there are 500 Individuals of Indian descent, 250 of British descent and 750 of African descent.
A person of Indian descent living in the area would have an ethnic density score of 100*(500/1500) = 33.3%
A person of Chinese descent living in the area would have an ethnic density score of 100*(0/1500) = 0%
A person of African descent living in the area would have an ethnic density score of 100*(750/1500) = 50%
A person of British descent living in the area would have an ethnic density score of 100*(250/1500) = 16.6%
The dataset in use, is sourced from a specialist mental wellbeing trust. The trust is local to four boroughs in South East London: Croydon, Lewisham, Lambeth and Southwark. Cause of death and Ethnic Density data was linked to the existing database from census database.
This project uses clinical psychiatric data to measure ethnic density at diagnosis, and explore its effect on deaths by suicide.
Use cleaned dataset to further explore ethnic density.
Load using: load(“./EDclean.RData”) Note: it helps to name the RData file and dataset to save the same name.
load("/Users/andreafernandes/Desktop/Google Drive/Springboard_Data_Science_Course_2016/RLearning_Springboard_ProjectED/R_Springboard/Rmd_files_and_output/EDclean.RData")
Have a look at the ethnic density score data.
Summary of the Spread
EDclean %>% summarise(Min = min(ethnicdensityscore, na.rm=TRUE),
Median = median(ethnicdensityscore, na.rm=TRUE),
Mean = mean(ethnicdensityscore, na.rm=TRUE),
Var = var(ethnicdensityscore, na.rm=TRUE),
SD = sd(ethnicdensityscore, na.rm=TRUE),
Max = max(ethnicdensityscore, na.rm=TRUE),
N = n()) %>%
print()
## Source: local data frame [1 x 7]
##
## Min Median Mean Var SD Max N
## (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (int)
## 1 0 30.3 32.99887 673.9214 25.96 99 47581
# Add a Normal Curve
x <- EDclean$ethnicdensityscore
h<-hist(x, breaks=250, col="red", xlab="Ethnic Density Score", main="Histogram with Normal Curve")
xfit<-seq(min(x),max(x),length=100)
yfit<-dnorm(xfit,mean=mean(x),sd=sd(x))
yfit <- yfit*diff(h$mids[1:2])*length(x)
lines(xfit, yfit, col="blue", lwd=2)
Plotting Ethnic Density score
boxplot(EDclean$ethnicdensityscore, horizontal = TRUE, main = "Total Own-group Ethnic Density Spread")
Have a look at ethnic density score by each ethnic group
qplot(ethnicitycleaned, ethnicdensityscore, data=EDclean, geom=c("boxplot"), main="Ethnic Density Spread by Ethnicity", xlab=" ") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + coord_flip()
# returns the density data
plot(density(EDclean$ethnicdensityscore, kernel = c("epanechnikov")), main="EthnicDensity Density Plot")
ggplot(EDclean, aes(x=ethnicdensityscore)) + geom_density(aes(colour=ethnicitycleaned), alpha=1)
qplot(ethnicdensityscore, data = EDclean, colour=ethnicitycleaned) +
facet_wrap(~ ethnicitycleaned)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
From the plots, what can you say about the ethnic density distribution?
i) Of the individuals known to mental health services, more than half live in areas of low ethnic density.
ii) Individuals of the ethnic minority groups tend to live in areas of low ethnic density.
iii) The ethnic density spread is not exactly a normal curve?
iv) This could be an indication that individuals of ethnic minority do not necessarily live in areas of low ethnic density.
There’s been some research to suggest that ethnic minority living in higher areas of ethnic density are also more deprived.
Does area-level deprivation correlate with ethnic density?
qplot(ethnicdensityscore, imd_score, data = EDclean, colour=ethnicitycleaned) +
facet_wrap(~ ethnicitycleaned)
## Warning: Removed 49 rows containing missing values (geom_point).
This is a representation of the ethnic density in clinical cohort. Does this distribution also reflect in non-clinical sample?
Use existing data to compare ethnic density score spread in clinical cohort versus general population cohort.
Code to select Non clinical dataset
EDclean_GenPop <- EDclean %>%
select(LSOA_4boroughs, WhiteBrit_EDPercent, OtherWhite_EDPercent, WhiteBlackCarib_EDPercent, WhiteAsian_EDPercent,
OtherMixed_EDPercent, BritIndian_EDPercent,BritPakistani_EDPercent, BritBangladeshi_EDPercent, BritChinese_EDPercent,
OtherAsian_EDPercent, African_EDPercent, Caribbean_EDPercent, OtherBlack_EDPercent)
Basic comparisons
boxplot(EDclean$WhiteBrit_EDPercent, EDclean_WhiteBrit$ethnicdensityscore,
horizontal = TRUE,
names = c("WhiteBrit_GenPop","WhiteBrit_Clinical"),
col = c("turquoise", "blue"),
main = "White British Ethnic Density Spread")
EDclean_African <- filter(EDclean, ethnicitycleaned == "African (N)")
boxplot(EDclean$African_EDPercent, EDclean_African$ethnicdensityscore,
horizontal = TRUE,
names = c("African_GenPop","African_Clinical"),
col = c("turquoise", "blue"),
main = "Black British and African Ethnic Density Spread")
EDclean_Caribbean <- filter(EDclean, ethnicitycleaned == "Caribbean (M)")
boxplot(EDclean$Caribbean_EDPercent, EDclean_Caribbean$ethnicdensityscore,
horizontal = TRUE,
names = c("Caribbean_GenPop","Caribbean_Clinical"),
col = c("turquoise", "blue"),
main = "Caribbean Ethnic Density Spread")
EDclean_GenPopulation <- EDclean %>%
select(LSOA_4boroughs, WhiteBrit_EDPercent, OtherWhite_EDPercent, WhiteBlackCarib_EDPercent, WhiteAsian_EDPercent, OtherMixed_EDPercent,
BritIndian_EDPercent,BritPakistani_EDPercent, BritBangladeshi_EDPercent, BritChinese_EDPercent, OtherAsian_EDPercent, African_EDPercent,
Caribbean_EDPercent, OtherBlack_EDPercent)
data.frame(head(EDclean_GenPopulation))
library("tidyr", lib.loc="/usr/local/lib/R/3.2/site-library")
newtable <- gather(EDclean_GenPopulation, ethnicity, ed_score, 2:14)
A comparison of Ethnic Density Spread in Clinical vs Non Clinical Samples by Borough
qplot(ethnicity,ed_score, data=newtable, colour = ethnicity, geom = "boxplot") + facet_wrap(~ LSOA_4boroughs) + theme(axis.text.x = element_text(angle = 60, hjust = 1))
qplot(ethnicitycleaned,ethnicdensityscore, data=EDclean, colour = ethnicitycleaned, geom = "boxplot") + facet_wrap(~ LSOA_4boroughs) + theme(axis.text.x = element_text(angle = 60, hjust = 1))